Abstract: In practice, many communication networks are huge in scale, complicated in structure, or even dynamic, so predesigning network codes based on the network topology is often impossible even when the topological structure is known. Random linear network coding was therefore proposed as an acceptable coding technique. In this paper, we further study the performance of random linear network coding by analyzing the failure probabilities at a sink node under different degrees of knowledge of the network topology, and we obtain tight and asymptotically tight upper bounds on these failure probabilities. In particular, the worst cases are identified for these bounds. Furthermore, the more information about the network topology is utilized, the better the upper bounds obtained. These bounds improve on the known ones. Finally, we also discuss a lower bound on this failure probability and show that it is asymptotically tight.

I. Introduction 

Network coding was first introduced by Yeung and Zhang in [1] and then profoundly developed by Ahlswede et al. in [2]. In the latter paper [2], the authors showed that if coding is applied at the nodes instead of routing alone, the source node can multicast the information to all sink nodes at the theoretically maximum rate. Li et al. [3] indicated that linear network coding with a finite alphabet size is sufficient for multicast. In [4], Koetter and Médard presented an algebraic characterization of network coding. Although network coding allows a higher information rate than classical routing, Jaggi et al. [5] still proposed a deterministic polynomial-time algorithm to construct a linear network code. For a detailed and comprehensive discussion of network coding, refer to [6], [7], [8], [9], and [10].

Random linear network coding was originally proposed and analyzed by Ho et al. in [11] and [12], where the main results are upper bounds on the failure probabilities of the code. Balli, Yan, and Zhang [13] improved on these bounds and studied the tightness of the new bounds by analyzing the asymptotic behavior of the failure probability as the field size goes to infinity. However, the upper bounds on the failure probabilities proposed by Ho et al. [12] and by Balli et al. [13] are not tight. In this paper, we further study random linear network coding and improve on the bounds on the failure probabilities for different cases. In particular, the more that is known about the topology of the network, the better the bounds we can obtain. Further, we show that these bounds are either tight or asymptotically tight.



II. Linear Network Coding and Preliminaries 

A communication network is defined as a finite acyclic directed graph $G = (V, E)$, where the vertex set $V$ stands for the set of nodes and the edge set $E$ represents the set of communication channels of the network. The node set $V$ consists of three disjoint subsets $S$, $T$, and $J$, where $S$ is the set of source nodes, $T$ is the set of sink nodes, and the remaining nodes in $J = V - S - T$ are called internal nodes. A directed edge $e = (i, j) \in E$ represents a channel leading from node $i$ to node $j$. Node $i$ is called the tail of the channel $e$, node $j$ is called the head of the channel $e$, and they are written as $i = \mathrm{tail}(e)$, $j = \mathrm{head}(e)$, respectively. Correspondingly, the channel $e$ is called an outgoing channel of $i$ and an incoming channel of $j$. For each node $i$, define

$$\mathrm{Out}(i) = \{e \in E : e \text{ is an outgoing channel of } i\},$$
$$\mathrm{In}(i) = \{e \in E : e \text{ is an incoming channel of } i\}.$$

For each channel $e \in E$, there exists a positive number $R_e$ called the capacity of the channel $e$. We allow multiple channels between two nodes and can therefore reasonably assume that every channel has unit capacity. That is, one field symbol can be transmitted over a channel in one unit of time. The source nodes generate messages and transmit them to all sink nodes over the network by network coding.

In this paper, we consider single source multicast networks throughout, i.e., $|S| = 1$, and the unique source node is denoted by $s$. The source node $s$ has no incoming channels and no sink node has outgoing channels, but we use the concept of imaginary incoming channels of the source node $s$ and assume that these imaginary channels provide the source messages to $s$. Let the information rate be $w$ symbols per unit time, which means that the source node $s$ has $w$ imaginary incoming channels $d_1, d_2, \ldots, d_w$, and let $\mathrm{In}(s) = \{d_1, d_2, \ldots, d_w\}$. The source messages are $w$ symbols $X = (X_1, X_2, \ldots, X_w)$ arranged in a row vector, where each $X_i$ is an element of the finite base field $\mathcal{F}$. Assume that they are transmitted to $s$ through the $w$ imaginary channels. Using network coding, these messages are multicast to each sink node and decoded there.

We use $U_e$ to denote the message transmitted over channel $e = (i, j)$, and $U_e$ is calculated by the formula

$$U_e = \sum_{d \in \mathrm{In}(i)} k_{d,e} U_d,$$

where the $k_{d,e}$ are the local coding coefficients and, at the source node $s$, the message transmitted over the $i$th imaginary channel $d_i$ is assumed to be the $i$th source message, i.e., $U_{d_i} = X_i$. By the definition of the global kernel $f_e$ of the channel $e$, we have $U_e = X \cdot f_e$.

The linear network coding discussed above is designed based on the global topology of the network. However, in most communication networks we cannot utilize the global topology because the network is huge in scale, complicated in structure, even dynamic, or for other reasons. In other words, it is impossible to use predesigned codes based on the global topology. Thus random linear network coding was proposed as an acceptable coding technique. The main idea of random network coding is that when a node (possibly the source node $s$) receives the messages from all its incoming channels, then for each outgoing channel it picks the encoding coefficients uniformly at random from the base field $\mathcal{F}$, uses them to encode the messages, and transmits the encoded message over the outgoing channel. In other words, the local coding coefficients $k_{d,e}$ are independently and uniformly distributed random variables taking values in the base field $\mathcal{F}$. Since random linear network coding does not consider the global network topology and does not coordinate the coding at different nodes, it may not achieve the best possible performance of network coding; that is, some sink nodes may not decode correctly. Therefore, the performance analysis of random linear network coding is important in theory and application.

Before further discussion, we introduce some notation and 
definitions as follows. 

Let $A$ be a set of vectors from a linear space. $\langle A \rangle$ denotes the linear subspace spanned by the vectors in $A$. In addition, we give the definition of the failure probability at a sink node, which was introduced in [13].

Definition 1: Let $G$ be a single source multicast network, and let the information rate be $w$ symbols per unit time. Let $F_t = (f_e : e \in \mathrm{In}(t))$ be the decoding matrix at sink node $t \in T$. Then $P_{e_t} = \Pr(\mathrm{Rank}(F_t) < w)$ is called the failure probability of the random linear network coding at sink node $t$, that is, the probability that the source messages cannot be decoded correctly at sink node $t$.
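The random coding rule and Definition 1 can be illustrated with a small Monte Carlo sketch. This is a minimal illustration under stated assumptions, not the authors' code: the base field is taken to be a prime field GF(p) so that field arithmetic is plain modular arithmetic, the channel list is assumed to be given in an upstream-first order, and the names `rank_mod_p`, `sink_fails`, and `estimate_failure` are hypothetical helpers introduced here.

```python
import random

def rank_mod_p(vectors, p):
    """Rank over GF(p), p prime, of a list of equal-length vectors."""
    rows = [[x % p for x in v] for v in vectors]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p
                           for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def sink_fails(edges, source, sink, w, p, rng):
    """One random trial: True if Rank(F_t) < w at the sink.

    Assumption: `edges` is a list of (tail, head) pairs ordered so that
    every channel entering a node appears before any channel leaving it.
    """
    # The w imaginary channels into the source carry the unit vectors.
    arriving = {source: [[int(i == j) for j in range(w)] for i in range(w)]}
    for tail, head in edges:
        inputs = arriving.get(tail, [])
        # Local coding coefficients k_{d,e}, uniform and independent.
        coeffs = [rng.randrange(p) for _ in inputs]
        kernel = [sum(c * f[j] for c, f in zip(coeffs, inputs)) % p
                  for j in range(w)]
        arriving.setdefault(head, []).append(kernel)
    return rank_mod_p(arriving.get(sink, []), p) < w

def estimate_failure(edges, source, sink, w, p, trials=20000, seed=0):
    """Monte Carlo estimate of the failure probability at the sink."""
    rng = random.Random(seed)
    fails = sum(sink_fails(edges, source, sink, w, p, rng)
                for _ in range(trials))
    return fails / trials
```

For instance, with a single channel from `"s"` to `"t"`, `w = 1` and `p = 2`, decoding fails exactly when the one coefficient is zero, so `estimate_failure([("s", "t")], "s", "t", 1, 2)` should come out close to 1/2.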

III. Failure Probabilities of Random Linear 
Network Coding at Sink Node 

As noted above, the performance analysis of random linear network coding is very important in theory and application. In particular, random linear network coding is an acceptable coding technique for non-coherent networks. However, many coherent networks are huge and complicated, and thus random linear network coding is often used for coherent networks as well. In this section, we study the failure probability $P_{e_t}$ for networks ranging from coherent to non-coherent. First, we give the following lemma.

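Lemma 1: Let $\mathcal{L}$ be an $n$-dimensional linear space over a finite field $\mathcal{F}$, let $\mathcal{L}_0$, $\mathcal{L}_1$ be two subspaces of $\mathcal{L}$ of dimensions $k_0$, $k_1$, respectively, and let $\langle \mathcal{L}_0 \cup \mathcal{L}_1 \rangle = \mathcal{L}$. Let $l_1, l_2, \ldots, l_m$ ($m = n - k_0$) be $m$ independently and uniformly distributed random vectors taking values in $\mathcal{L}_1$. Then

$$\Pr\big(\dim(\langle \mathcal{L}_0 \cup \{l_1, l_2, \ldots, l_m\} \rangle) = n\big) = \prod_{i=1}^{n-k_0} \left(1 - \frac{1}{|\mathcal{F}|^i}\right).$$

Therefore,

$$\frac{1}{|\mathcal{F}|} \le \Pr\big(\dim(\langle \mathcal{L}_0 \cup \{l_1, l_2, \ldots, l_m\} \rangle) < n\big) < \frac{1}{|\mathcal{F}| - 1}.$$

Remark 1: Under the conditions of Lemma 1, $\Pr\big(\dim(\langle \mathcal{L}_0 \cup \{l_1, l_2, \ldots, l_m\} \rangle) = n\big)$ does not depend on the dimension of $\mathcal{L}_1$.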
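The product formula of Lemma 1 and the observation of Remark 1 can be checked by brute force in a tiny case. The sketch below is illustrative only: it assumes the field GF(2), represents vectors as bit tuples, and uses hypothetical helper names (`span_gf2`, `full_span_probability`, `lemma1_product`). It enumerates every choice of the random vectors and compares the exact probability with the product over $i = 1, \ldots, n - k_0$ of $(1 - 1/|\mathcal{F}|^i)$.

```python
from fractions import Fraction
from itertools import product

def span_gf2(vectors, n):
    """All vectors in the GF(2)-span, each encoded as a tuple of n bits."""
    span = {tuple([0] * n)}
    for v in vectors:
        span |= {tuple(a ^ b for a, b in zip(u, v)) for u in span}
    return span

def full_span_probability(basis_L0, basis_L1, n):
    """Exact probability that L0 together with m = n - dim(L0) vectors
    drawn uniformly from L1 spans the whole n-dimensional space."""
    L0 = span_gf2(basis_L0, n)
    L1 = sorted(span_gf2(basis_L1, n))
    k0 = len(L0).bit_length() - 1  # |L0| = 2^k0
    m = n - k0
    hits = 0
    for choice in product(L1, repeat=m):
        if len(span_gf2(list(basis_L0) + list(choice), n)) == 2 ** n:
            hits += 1
    return Fraction(hits, len(L1) ** m)

def lemma1_product(q, m):
    """prod_{i=1}^{m} (1 - 1/q^i), computed exactly."""
    result = Fraction(1)
    for i in range(1, m + 1):
        result *= 1 - Fraction(1, q ** i)
    return result
```

With $n = 3$ and $\mathcal{L}_0$ spanned by `(1, 0, 0)`, taking $\mathcal{L}_1$ to be the plane spanned by `(0, 1, 0)` and `(0, 0, 1)` or the whole space gives the same probability, which agrees with `lemma1_product(2, 2)`.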

Let $G$ be a single source multicast network, where the single source node is denoted by $s$, the set of sink nodes is denoted by $T$, and the minimum cut capacity between $s$ and $t \in T$ is $C_t$. The information rate is $w \le \min_{t \in T} C_t$ symbols per unit time.

For each sink node $t \in T$, since $w \le C_t$, Menger's Theorem implies that there exist $w$ channel-disjoint paths from $s$ to $t$. Let the arbitrarily chosen $w$ channel-disjoint paths from $s$ to $t$ be $\mathcal{P}_t = \{P_{t,1}, P_{t,2}, \ldots, P_{t,w}\}$, and let $P_{t,i} = \{e_{i,1}, e_{i,2}, \ldots, e_{i,m_i}\}$ satisfy $\mathrm{tail}(e_{i,1}) = s$, $\mathrm{head}(e_{i,m_i}) = t$, and $\mathrm{head}(e_{i,j-1}) = \mathrm{tail}(e_{i,j})$ for $2 \le j \le m_i$. The set of all channels in $\mathcal{P}_t$ is denoted by $E_{\mathcal{P}_t}$. Furthermore, assume that the number of nodes in $\mathcal{P}_t$ is $r + 2$: the source node $s$, the sink node $t$, and $r$ internal nodes, denoted by $i_1, i_2, \ldots, i_r$. These nodes admit an ancestral (topological) order and, without loss of generality, let the order be

$$s = i_0 \prec i_1 \prec i_2 \prec \cdots \prec i_r \prec i_{r+1} = t.$$

During our discussion, we use the concept of cuts of the paths from $s$ to $t$ proposed in [13], which is different from the concept of cuts of a network in graph theory. The first cut is $CUT_{t,0} = \mathrm{In}(s)$, i.e., the set of the $w$ imaginary channels. Through the node $i_0 = s$, the next cut $CUT_{t,1}$ is the set of the first channels of all $w$ paths, i.e., $CUT_{t,1} = \{e_{i,1} : 1 \le i \le w\}$. Through the node $i_1$, the next cut $CUT_{t,2}$ is formed from $CUT_{t,1}$ by replacing those channels in $\mathrm{In}(i_1) \cap CUT_{t,1}$ by their respective next channels in the paths. These new channels are in $\mathrm{Out}(i_1) \cap E_{\mathcal{P}_t}$. The other channels remain the same as in $CUT_{t,1}$. Subsequently, once $CUT_{t,k}$ is defined, $CUT_{t,k+1}$ is formed from $CUT_{t,k}$ by the same method. By induction, all cuts $CUT_{t,k}$ for $k = 0, 1, \ldots, r + 1$ can be defined. Furthermore, we divide each $CUT_{t,k}$ into two disjoint parts $CUT_{t,k}^{out}$ and $CUT_{t,k}^{in}$, where

$$CUT_{t,k}^{out} = \{e : e \in CUT_{t,k} \setminus \mathrm{In}(i_k)\},$$
$$CUT_{t,k}^{in} = \{e : e \in CUT_{t,k} \cap \mathrm{In}(i_k)\}.$$

Theorem 2: For the network $G$ described above, the failure probability of random linear network coding at sink node $t \in T$ satisfies

$$P_{e_t} \le 1 - \prod_{k=0}^{r} \prod_{i=1}^{w - |CUT_{t,k}^{out}|} \left(1 - \frac{1}{|\mathcal{F}|^i}\right).$$



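Proof: For sink node $t \in T$, the decoding matrix $F_t = (f_e : e \in \mathrm{In}(t))$ is a $w \times |\mathrm{In}(t)|$ matrix over the field $\mathcal{F}$. Define a $w \times w$ matrix $F'_t = (f_{e_{1,m_1}}, f_{e_{2,m_2}}, \ldots, f_{e_{w,m_w}})$. It is not hard to see that $F'_t$ is a submatrix of $F_t$. It follows that the event "$\mathrm{Rank}(F_t) < w$" is contained in the event "$\mathrm{Rank}(F'_t) < w$". This means that

$$\Pr(\mathrm{Rank}(F_t) < w) \le \Pr(\mathrm{Rank}(F'_t) < w).$$

Further define $w \times w$ matrices $F_t^{(k)} = (f_e : e \in CUT_{t,k})$ for $k = 0, 1, \ldots, r + 1$. If $\mathrm{Rank}(F_t^{(k)}) < w$, we say that we have a failure at $CUT_{t,k}$. We use $\Gamma_{t,k}$ to denote the event "$\mathrm{Rank}(F_t^{(k)}) = w$". Obviously, $F_t^{(r+1)} = F'_t$ because $CUT_{t,r+1} = \{e_{1,m_1}, e_{2,m_2}, \ldots, e_{w,m_w}\}$. This implies

$$\Pr(\mathrm{Rank}(F'_t) < w) = \Pr(\mathrm{Rank}(F_t^{(r+1)}) < w) = \Pr(\Gamma_{t,r+1}^{c}) = 1 - \Pr(\Gamma_{t,r+1}).$$

In addition, since encoding at any node is independent of what happened before this node as long as no failure has occurred up to this node, we have

$$\Pr(\Gamma_{t,r+1}) \ge \Pr(\Gamma_{t,r+1} \Gamma_{t,r} \cdots \Gamma_{t,1} \Gamma_{t,0})$$
$$= \Pr(\Gamma_{t,r+1} | \Gamma_{t,r}) \Pr(\Gamma_{t,r} | \Gamma_{t,r-1}) \cdots \Pr(\Gamma_{t,1} | \Gamma_{t,0}) \Pr(\Gamma_{t,0})$$
$$= \Pr(\Gamma_{t,r+1} | \Gamma_{t,r}) \Pr(\Gamma_{t,r} | \Gamma_{t,r-1}) \cdots \Pr(\Gamma_{t,1} | \Gamma_{t,0}), \qquad (1)$$

where (1) follows because $\Pr(\Gamma_{t,0}) = \Pr(\mathrm{Rank}((f_e : e \in \mathrm{In}(s))) = w) = \Pr(\mathrm{Rank}(I_{w \times w}) = w) = 1$, with $I_{w \times w}$ being the $w \times w$ identity matrix.

Therefore, applying Lemma 1 for each $k$ ($0 \le k \le r$), we have

$$\Pr(\Gamma_{t,k+1} | \Gamma_{t,k}) = \prod_{i=1}^{w - |CUT_{t,k}^{out}|} \left(1 - \frac{1}{|\mathcal{F}|^i}\right), \qquad (2)$$

where under the condition $\Gamma_{t,k}$ there must be $|CUT_{t,k}^{out}| = \dim(\langle \{f_e : e \in CUT_{t,k}^{out}\} \rangle) = \mathrm{Rank}((f_e : e \in CUT_{t,k}^{out}))$.

Combining (1) and (2), it follows that

$$\Pr(\Gamma_{t,r+1}) \ge \prod_{k=0}^{r} \prod_{i=1}^{w - |CUT_{t,k}^{out}|} \left(1 - \frac{1}{|\mathcal{F}|^i}\right).$$

That is, we get the upper bound on the failure probability at the sink node $t$:

$$P_{e_t} \le 1 - \prod_{k=0}^{r} \prod_{i=1}^{w - |CUT_{t,k}^{out}|} \left(1 - \frac{1}{|\mathcal{F}|^i}\right).$$

The proof is completed. ■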
Remark 2: The upper bound on the failure probability at the sink node $t$ in Theorem 2 is tight.

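Example 1: For the well-known butterfly network, where $w = 2$, Theorem 2 gives

$$P_{e_t} \le 1 - \left(1 - \frac{1}{|\mathcal{F}|}\right)^{4} \prod_{i=1}^{2} \left(1 - \frac{1}{|\mathcal{F}|^i}\right) = 1 - \frac{(|\mathcal{F}| + 1)(|\mathcal{F}| - 1)^{6}}{|\mathcal{F}|^{7}}.$$

On the other hand, Guang and Fu [14] have shown that for the butterfly network $P_{e_t} = 1 - (|\mathcal{F}| + 1)(|\mathcal{F}| - 1)^{6} / |\mathcal{F}|^{7}$. This means that this upper bound is tight for the butterfly network.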
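The tightness claim for the butterfly network can be checked exhaustively over a small field. The sketch below makes several assumptions that are not in the source: it uses the usual butterfly topology with hypothetical node names `a`, `b`, `c`, `d` (channels s→a, s→b, a→t, a→c, b→c, c→d, d→t; the channels toward the other sink are omitted because they do not influence $t$), it takes the field size $q$ to be prime, and the closed form $1 - (q+1)(q-1)^6/q^7$ is our own hand evaluation of the Theorem 2 bound for this topology with $w = 2$.

```python
from fractions import Fraction
from itertools import product

def butterfly_sink_failure(q):
    """Exact failure probability at one sink of the butterfly network
    over GF(q), q prime, by enumerating all local coefficients."""
    successes = 0
    total = 0
    # a1, a2 / b1, b2: coefficients at s for channels (s,a) and (s,b)
    # k_at: channel (a,t); k_ac: channel (a,c); k_bc: channel (b,c)
    # alpha, beta: coefficients at c for channel (c,d); gamma: channel (d,t)
    for a1, a2, b1, b2, k_at, k_ac, k_bc, alpha, beta, gamma in product(
            range(q), repeat=10):
        f_sa = (a1, a2)
        f_sb = (b1, b2)
        f_at = tuple(k_at * x % q for x in f_sa)
        f_ac = tuple(k_ac * x % q for x in f_sa)
        f_bc = tuple(k_bc * x % q for x in f_sb)
        f_cd = tuple((alpha * x + beta * y) % q for x, y in zip(f_ac, f_bc))
        f_dt = tuple(gamma * x % q for x in f_cd)
        # The sink decodes iff the two received kernels are independent.
        det = (f_at[0] * f_dt[1] - f_at[1] * f_dt[0]) % q
        successes += det != 0
        total += 1
    return 1 - Fraction(successes, total)

def butterfly_closed_form(q):
    """Hand-evaluated Theorem 2 bound for the butterfly network (w = 2)."""
    return 1 - Fraction((q + 1) * (q - 1) ** 6, q ** 7)
```

For $q = 2$ and $q = 3$ the enumeration and the closed form agree exactly, which is consistent with the bound being attained on this network.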



However, this upper bound is too complicated to use in practice. Thus, we give an upper bound that is simpler in form but looser.

Theorem 3: For this network $G$, the failure probability of the random linear network coding at sink node $t \in T$ satisfies

$$P_{e_t} \le 1 - \left[\prod_{i=1}^{w} \left(1 - \frac{1}{|\mathcal{F}|^i}\right)\right]^{r+1}.$$



In particular, if we choose the w channel-disjoint paths 
with the minimum number of the internal nodes among the 
collection of all w channel-disjoint paths from s to t over 
network G, and denote this minimum number by Rt, then we 
get a smaller upper bound with the same simple form. 

Corollary 4: For this network $G$, the failure probability of the random linear network coding at sink node $t \in T$ satisfies

$$P_{e_t} \le 1 - \left[\prod_{i=1}^{w} \left(1 - \frac{1}{|\mathcal{F}|^i}\right)\right]^{R_t + 1}.$$
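To see how much is lost by passing from the cut-based bound of Theorem 2 to the simpler form, the two expressions can be evaluated exactly. This is a hedged sketch: the function names are hypothetical, `out_sizes` is assumed to list $|CUT_{t,k}^{out}|$ for $k = 0, \ldots, r$, and the simpler bound is assumed to have the form $1 - [\prod_{i=1}^{w}(1 - 1/|\mathcal{F}|^i)]^{r+1}$, which is the form matched by the plait-network example.

```python
from fractions import Fraction

def inner_product(q, m):
    """prod_{i=1}^{m} (1 - 1/q^i), computed exactly."""
    result = Fraction(1)
    for i in range(1, m + 1):
        result *= 1 - Fraction(1, q ** i)
    return result

def theorem2_bound(q, w, out_sizes):
    """Cut-based bound; out_sizes[k] = |CUT^out_{t,k}| for k = 0..r."""
    success = Fraction(1)
    for size in out_sizes:
        success *= inner_product(q, w - size)
    return 1 - success

def simple_bound(q, w, r):
    """Simpler bound that depends only on w and the node count r."""
    return 1 - inner_product(q, w) ** (r + 1)
```

As a usage example, a butterfly-like configuration with $w = 2$, $r = 4$ and `out_sizes = [0, 1, 1, 1, 1]` over GF(2) gives 125/128 for the cut-based bound and 32525/32768 for the simpler one, so the simpler bound is noticeably looser there. Replacing `r` by the minimum node count $R_t$ gives the value of the corollary's bound.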



Remark 3: Both upper bounds on the failure probability at the sink node, in Theorem 3 and in Corollary 4, are tight, and the tightness can be shown in the same way. Therefore, we only construct a network to show the tightness of the upper bound in Theorem 3. In other words, we give a network that is the worst case.

Example 2: For the given information rate $w$, the network $G_1$ shown in Fig. 1 can be constructed as follows. Let the source node be $s$, the sink node be $t$, and the number of internal nodes be $r$, and denote these internal nodes by $i_1, i_2, \ldots, i_r$. Let the topological order of all nodes be

$$s \prec i_1 \prec i_2 \prec \cdots \prec i_r \prec t.$$

[Fig. 1. Plait network with $r$ internal nodes; each pair of consecutive nodes is joined by $w$ channels.]

Draw $w$ parallel channels from $s$ to $i_1$, $w$ parallel channels from $i_1$ to $i_2$, and so on in succession, ending with $w$ parallel channels from $i_r$ to $t$. These $(r + 1)w$ channels are all the channels of the network $G_1$. We call networks of this type plait networks. For this constructed network $G_1$, we will show that the failure probability $P_{e_t}$ at sink node $t$ is

$$P_{e_t} = 1 - \left[\prod_{i=1}^{w} \left(1 - \frac{1}{|\mathcal{F}|^i}\right)\right]^{r+1}.$$



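It is not difficult to see that the event "$\mathrm{Rank}(F_t) < w$" is equivalent to the event "$\mathrm{Rank}(F_t^{(r+1)}) < w$" because $F_t = F'_t = F_t^{(r+1)}$. This implies

$$P_{e_t} = \Pr(\mathrm{Rank}(F_t^{(r+1)}) < w) = 1 - \Pr(\Gamma_{t,r+1}).$$

Furthermore, for $G_1$,

$$\Pr(\Gamma_{t,r+1}) = \Pr(\Gamma_{t,r+1} \Gamma_{t,r} \cdots \Gamma_{t,1} \Gamma_{t,0}) = \Pr(\Gamma_{t,r+1} | \Gamma_{t,r}) \Pr(\Gamma_{t,r} | \Gamma_{t,r-1}) \cdots \Pr(\Gamma_{t,1} | \Gamma_{t,0}).$$

And, for any $k = 0, 1, \ldots, r$,

$$\Pr(\Gamma_{t,k+1} | \Gamma_{t,k}) = \Pr\big(f_{e_{k,1}} \notin \langle \mathbf{0} \rangle,\ f_{e_{k,2}} \notin \langle f_{e_{k,1}} \rangle,\ f_{e_{k,3}} \notin \langle f_{e_{k,1}}, f_{e_{k,2}} \rangle,\ \ldots,\ f_{e_{k,w}} \notin \langle \{f_{e_{k,1}}, \ldots, f_{e_{k,w-1}}\} \rangle\big) = \prod_{i=1}^{w} \left(1 - \frac{1}{|\mathcal{F}|^i}\right),$$

where $\mathrm{Out}(i_k) = \{e_{k,1}, e_{k,2}, \ldots, e_{k,w}\}$ and $\mathbf{0}$ is a zero vector.

Combining the above, we get

$$\Pr(\Gamma_{t,r+1}) = \left[\prod_{i=1}^{w} \left(1 - \frac{1}{|\mathcal{F}|^i}\right)\right]^{r+1},$$

that is,

$$P_{e_t} = 1 - \Pr(\Gamma_{t,r+1}) = 1 - \left[\prod_{i=1}^{w} \left(1 - \frac{1}{|\mathcal{F}|^i}\right)\right]^{r+1}.$$

This means that the upper bound on the failure probability at the sink node is tight, and plait networks are the worst case.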
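The plait-network computation can be reproduced exactly for small parameters. The sketch below is illustrative and rests on assumptions of ours: it fixes $w = 2$ so that decodability reduces to a $2 \times 2$ determinant, takes the field size $q$ to be prime, and uses the hypothetical names `plait_failure_w2` and `plait_closed_form`. Each stage (the source and every internal node) applies an independent uniformly random $2 \times 2$ coefficient matrix to the kernels it receives.

```python
from fractions import Fraction
from itertools import product

def plait_failure_w2(q, r):
    """Exact failure probability at the sink of a plait network with
    w = 2 and r internal nodes over GF(q), q prime, by enumeration."""
    stages = r + 1  # the source plus r internal nodes each re-encode
    failures = 0
    total = 0
    for coeffs in product(range(q), repeat=4 * stages):
        F = [[1, 0], [0, 1]]  # kernels of the two imaginary channels
        for k in range(stages):
            a, b, c, d = coeffs[4 * k: 4 * k + 4]
            K = [[a, b], [c, d]]  # column j: coefficients of outgoing channel j
            F = [[sum(F[i][l] * K[l][j] for l in range(2)) % q
                  for j in range(2)] for i in range(2)]
        det = (F[0][0] * F[1][1] - F[0][1] * F[1][0]) % q
        failures += det == 0
        total += 1
    return Fraction(failures, total)

def plait_closed_form(q, w, r):
    """1 - [prod_{i=1}^{w} (1 - 1/q^i)]^(r+1), computed exactly."""
    stage = Fraction(1)
    for i in range(1, w + 1):
        stage *= 1 - Fraction(1, q ** i)
    return 1 - stage ** (r + 1)
```

For example, with $q = 2$ and one internal node the enumeration gives 55/64, the same value as `plait_closed_form(2, 2, 1)`.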

As mentioned above, it is sometimes hard to use predesigned linear network codes based on the network topology even though the topology of the network is known. Usually, however, we can still obtain at least some information about the network topology. For instance, we may know at least the number of internal nodes $|J|$. In these cases, we can also analyze the performance of random linear network coding.

Theorem 5: Let $G$ be a single source multicast network. Using random linear network coding, the failure probability at the sink node $t \in T$ satisfies

$$P_{e_t} \le 1 - \left[\prod_{i=1}^{w} \left(1 - \frac{1}{|\mathcal{F}|^i}\right)\right]^{|J|+1}.$$



Remark 4: This upper bound is still tight and we can also 
give an example to indicate the tightness. 

Example 3: For a given information rate $w$, construct a plait network $G_2$, where the unique source node is $s$, the sink node is $t$, and all internal nodes are $i_1, i_2, \ldots, i_{|J|}$. Let the topological order of all nodes be

$$s = i_0 \prec i_1 \prec i_2 \prec \cdots \prec i_{|J|} \prec i_{|J|+1} = t.$$

There are $w$ parallel channels from $i_j$ to $i_{j+1}$, $0 \le j \le |J|$. Similar to the example above, we obtain that the failure probability $P_{e_t}$ for the plait network $G_2$ is

$$P_{e_t} = 1 - \left[\prod_{i=1}^{w} \left(1 - \frac{1}{|\mathcal{F}|^i}\right)\right]^{|J|+1}.$$



IV. The Lower Bounds of The Failure 
Probabilities 

In the last section, we gave some upper bounds on the failure probability at a sink node in order to analyze the performance of random linear network coding. In this section, we give a lower bound on this failure probability.

Theorem 6: For a single source multicast network $G$ using random linear network coding, the failure probability at the sink node $t$ satisfies $P_{e_t} \ge$ [expression missing in the source], where $\delta_t = C_t - w$.

Remark 5: The lower bound in this theorem is also asymp- 
totically tight. 

V. Conclusion 

The performance of random linear network coding is important for theory and application. In the present paper, we further analyze the upper bounds on the failure probability at a sink node. In particular, the more information about the network topology is utilized, the better the upper bounds obtained. We further discuss a lower bound on this failure probability and show that it is also asymptotically tight.

In addition, other probabilities, such as the failure probability for the network and the average failure probability, can also be defined to characterize the performance of random linear network coding. We have also analyzed these probabilities, but we omit them owing to space limitations.